Abstract

We establish subgeometric bounds on the convergence rate of general Markov processes
in the Wasserstein metric. In the discrete time setting we prove that the Lyapunov drift
condition and the existence of a "good" d-small set imply subgeometric convergence to
the invariant measure. In the continuous time setting we obtain the same convergence
rate provided that there exists a "good" d-small set and the Douc-Fort-Guillin
supermartingale condition holds. As an application of our results, we prove that the
Veretennikov-Khasminskii condition is sufficient for subexponential convergence of
strong solutions of stochastic delay differential equations.

1. Introduction

In this paper, we study the rate of convergence of Markov processes to an invariant measure
in the Wasserstein metric. We establish subgeometric bounds on the convergence rate,
thus generalizing the results of [U El E]. We apply the obtained estimates to prove sub- 
geometric ergodicity of strong solutions of stochastic differential delay equations (SDDEs) 
under Veretennikov-Khasminskii type conditions. This extends the corresponding results 
[25J [261 HS1 S] for stochastic differential equations (without delay). 

There are quite a few works which deal with convergence of Harris recurrent Markov 
chains in total variation (see, e.g., the monograph [16] and the references therein). Less 
is known about convergence of Markov chains that are not Harris recurrent. Recall ([12]) 
that if a Markov chain has a unique invariant measure, then either (a) the chain is positive 
Harris recurrent in an absorbing set and the invariant measure is non-singular or (b) the 
invariant measure is singular and there are no Harris sets. It is quite clear that in case 
(b) the marginal distributions of the Markov chain do not converge in total variation, 
whereas they might converge weakly (and, hence, in the Wasserstein metric). Thus, for
non-Harris chains (case (b)) it is natural to study convergence in the Wasserstein metric
(rather than in the total variation metric).

Many interesting Markov processes fall into case (b). For instance, following [11],
consider the SDDE

$$dX(t) = -cX(t)\,dt + g(X(t-1))\,dW(t), \quad t > 0,$$

where $c > 0$, $W$ is a one-dimensional Brownian motion, and $g$ is a strictly increasing
positive bounded continuous function. One can show that the strong solution of this
equation has a unique invariant measure and converges to it weakly, but not in total
variation. On the other hand, the Wasserstein distance between $X(t)$ and the invariant
measure decays exponentially to zero as $t \to \infty$. Section 3 contains further examples of
processes belonging to case (b).
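
To get a feeling for the delay equation above, one can simulate it numerically. The following is only a minimal Euler-Maruyama sketch, not a scheme taken from the paper: the concrete diffusion coefficient `g`, the constant initial segment on [-1, 0], the step size and the function name `simulate_sdde` are all illustrative assumptions.

```python
import math
import random


def simulate_sdde(c=1.0, g=None, x0=0.0, t_end=5.0, dt=0.01, seed=0):
    """Euler-Maruyama sketch for dX(t) = -c X(t) dt + g(X(t-1)) dW(t).

    The initial segment on [-1, 0] is assumed to be the constant x0.
    Returns the approximate path on the grid 0, dt, 2*dt, ..., t_end.
    """
    if g is None:
        # Hypothetical choice: strictly increasing, positive, bounded, continuous.
        g = lambda v: 1.5 + math.atan(v) / math.pi
    rng = random.Random(seed)
    lag = int(round(1.0 / dt))       # number of grid steps in one unit of delay
    steps = int(round(t_end / dt))
    path = [x0] * (lag + 1)          # values on the grid of [-1, 0]
    for _ in range(steps):
        x_now = path[-1]
        x_delayed = path[-1 - lag]   # approximation of X(t - 1)
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(x_now - c * x_now * dt + g(x_delayed) * dw)
    return path[lag:]                # drop the history before time 0
```

Running the sketch from two different initial values with the same seed gives a crude picture of how fast the two trajectories approach each other, which is the kind of coupling that underlies Wasserstein estimates.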

Many methods of estimation of convergence rates in the total variation metric assume
that a Markov process is $\psi$-irreducible and are based on the analysis of small sets.
Probably one of the first results in this area is due to Dobrushin [3], who proved that if the
whole state space is small then a Markov chain is exponentially ergodic. Later Popov
[20] and Nummelin and Tuominen [17] replaced the global Dobrushin condition with a
combination of a local Dobrushin condition (existence of a "good" small set) and the
Lyapunov drift condition (LDC). This result was further extended by Jarner and Roberts
[13] and Douc and coauthors [5], who established polynomial and general subgeometric
estimates of the convergence rate, respectively. Similar results for continuous time Markov
processes (under an additional assumption that the state space is locally compact) are
due to Fort and Roberts [7] and Douc, Fort, and Guillin [4]. The latter work provides
subgeometric estimates of the convergence rate under the condition that a certain functional
of a Markov process is a supermartingale. Let us also mention the recent paper of Hairer
and Mattingly [10], which contains a new simple proof of the exponential ergodicity of a
Markov process under the LDC and the local Dobrushin condition.

Thus, many techniques rely on the irreducibility of a Markov process, the existence 
of a "good" small set, and (for continuous time processes) the local compactness of the 
state space. However, if the state space is infinite-dimensional, then in most "typical"
situations the process is non-Harris and, therefore, these assumptions are not fulfilled. For
instance, if we go back to the above SDDE, then it is easy to check that this process is
not $\psi$-irreducible, the state space is not locally compact and, as was pointed out in [11], all
small sets of this process are degenerate (i.e., consist of no more than one point).

An alternative to the local Dobrushin condition was suggested by Bakry, Cattiaux, and
Guillin in [1]. They obtained estimates of the convergence rate in the total variation metric,
provided that the LDC holds and the Markov process has a unique invariant measure which
satisfies a local Poincaré inequality on a large enough set.

Let us discuss another alternative to this set of assumptions, which was developed
by Hairer, Mattingly, and Scheutzow ([11]) specifically for establishing exponential
convergence rates of SDDEs, stochastic PDEs, and other infinite-dimensional processes in
the Wasserstein metric. Exploiting a new notion of a d-small set (a generalization of the
notion of a small set), in conjunction with the LDC, and without any additional
assumptions on the irreducibility of the process, the authors proved the existence of a spectral
gap in a suitable norm, and, hence, the exponential convergence to stationarity.

We extend this result and consider the more general situation where a spectral gap
may not exist. For discrete time Markov processes (Theorem 2.1) we prove that the existence
of a "good" d-small set and the LDC imply subgeometric convergence in the Wasserstein
metric. In the continuous time setting (Theorem 2.4) we obtain the same rate of
convergence provided that there exists a "good" d-small set and the Douc-Fort-Guillin
supermartingale condition holds. Thus, we also extend the results of [4, 5].

We apply our conditions to study the asymptotic behavior of strong solutions of SDDEs.
We prove that Veretennikov-Khasminskii type conditions are sufficient for subexponential
ergodicity (Theorem 3.3). This extends the results of [251 E51 EI51 II].

The rest of the paper is organized as follows. Section 2 contains definitions and the 
main results. Applications to SDDEs and to an autoregressive model are presented in 
Section 3. The proofs of the main results are placed in Section 4. 

2. Main results 

Let $X = (X_n)_{n \in \mathbb{Z}_+}$ be a homogeneous Markov chain on a measurable space $(E, \mathcal{B}(E))$
with transition functions $P^n(x, A) := \mathsf{P}_x(X_n \in A)$, where $x \in E$, $A \in \mathcal{B}(E)$, $n \in \mathbb{Z}_+$.
As usual, for $n = 1$ we will drop the upper index and write $P(x, A)$. For a measurable
function $f\colon E \to [0, \infty)$ let $\mathcal{P}_f(E)$ be the set of probability measures on $(E, \mathcal{B}(E))$ which
integrate $f$. We will write $\mathcal{P}(E)$ for the set of all probability measures on $(E, \mathcal{B}(E))$.
If $\mu \in \mathcal{P}_f(E)$, denote $\mu(f) := \int_E f(x)\,\mu(dx)$. We define the Markov semigroup operators as
usual:

$$P\varphi(x) := \int_E \varphi(t)\,P(x, dt), \qquad P\mu(dx) := \int_E P(t, dx)\,\mu(dt).$$

Recall (see, e.g., [2]) that if $d$ is a semimetric on $E$, then the Wasserstein semidistance
$W_d$ between probability measures $\mu, \nu \in \mathcal{P}(E)$ is given by

$$W_d(\mu, \nu) := \inf_{\lambda \in \mathcal{C}(\mu, \nu)} \int_{E \times E} d(x, y)\,\lambda(dx, dy),$$

where $\mathcal{C}(\mu, \nu)$ is the set of all probability measures on $(E \times E, \mathcal{B}(E \times E))$ with marginals
$\mu$ and $\nu$. If $d$ is a proper metric then $W_d$ is a distance.

We also consider the total variation metric on the space $\mathcal{P}(E)$, which is defined by
the following formula:

$$d_{TV}(\mu, \nu) := 2 \sup_{A \in \mathcal{B}(E)} |\mu(A) - \nu(A)|, \quad \mu, \nu \in \mathcal{P}(E).$$

Recall that if the space $E$ is equipped with the discrete metric $d_0(x, y) := \mathbf{1}(x \ne y)$,
$x, y \in E$, then the Wasserstein distance is just half the total variation distance, i.e.,
$W_{d_0}(\mu, \nu) = d_{TV}(\mu, \nu)/2$, $\mu, \nu \in \mathcal{P}(E)$.
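
The relation between the two distances is easy to check numerically on a finite state space. The sketch below is an illustration only; it assumes that both measures are given as probability vectors over the same finite set, and it uses the standard fact that for the discrete metric the optimal coupling keeps the common mass $\sum_i \min(p_i, q_i)$ on the diagonal. The function names are hypothetical.

```python
def total_variation(p, q):
    """d_TV = 2 * sup_A |p(A) - q(A)|, which is the l1 distance on a finite set."""
    return sum(abs(a - b) for a, b in zip(p, q))


def wasserstein_discrete_metric(p, q):
    """W_{d_0}(p, q) for d_0(x, y) = 1(x != y).

    The minimal probability that the two coordinates of a coupling differ
    equals 1 - sum_i min(p_i, q_i).
    """
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
tv = total_variation(p, q)
w = wasserstein_discrete_metric(p, q)
```

For these two vectors `w` equals `tv / 2` up to floating-point rounding, in agreement with the identity above.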

Definition 2.1. A set $A \in \mathcal{B}(E)$ is called small for a Markov operator $P$ if there exists
$\varepsilon > 0$ such that for all $x, y \in A$

$$\tfrac{1}{2}\,d_{TV}(P(x, \cdot), P(y, \cdot)) \le 1 - \varepsilon.$$

For instance, any one-point set is small. However, as discussed above, a Markov
process might have no small sets that consist of more than one point. To study such
Markov processes Hairer, Mattingly and Scheutzow [11] introduced the following concept.

Definition 2.2. A set $A \in \mathcal{B}(E)$ is called d-small for a Markov operator $P$ if there exists
$\varepsilon > 0$ such that for all $x, y \in A$

$$W_d(P(x, \cdot), P(y, \cdot)) \le (1 - \varepsilon)\,d(x, y).$$

Note that our definition of a d-small set is a bit different from the definition of [11].

If $d(x, y) = \mathbf{1}(x \ne y)$, then the notions of a small set and a d-small set coincide. In the
general case, the latter notion is much weaker than the former. In Section 3.1 we give an
example of a Markov operator $P$ that has a d-small state space and no nontrivial small
sets.

Before we present our main result, let us recall that the total variation metric is
contracting, i.e., for any Markov semigroup $(P^t)_{t \ge 0}$ one has

$$d_{TV}(P^t(x, \cdot), P^t(y, \cdot)) \le d_{TV}(P^s(x, \cdot), P^s(y, \cdot)), \quad x, y \in E,$$

whenever $0 \le s \le t$. In general, the Wasserstein metric $W_d$ may not be contracting.
However, as discussed in detail in [11], it is natural to focus only on Wasserstein metrics
that are contracting for the process $X$, since, in the general case, the Lyapunov drift
condition is not sufficient even for weak convergence towards the invariant measure.
Note that the contractivity condition itself does not imply any convergence at all, either.
It is the combination of the contractivity, the Lyapunov drift condition and the existence
of a "good" d-small set which yields the existence and uniqueness of the invariant measure
and subgeometric convergence in the Wasserstein metric.
For a function $f\colon \mathbb{R}_+ \to (0; \infty)$ define

$$H_f(x) := \int_1^x \frac{1}{f(u)}\,du, \quad x > 1.$$

Since $H_f$ is increasing, the inverse function $H_f^{-1}$ is well defined.
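
For concrete choices of $f$ the function $H_f$ and its inverse can be evaluated numerically. The following sketch assumes that $H_f(x)$ is the integral of $1/f(u)$ over $[1, x]$; the midpoint rule, the bisection search and the names `H` and `H_inverse` are illustrative choices and not part of the paper.

```python
import math


def H(f, x, n=100_000):
    """Midpoint-rule approximation of the integral of 1/f(u) over [1, x]."""
    h = (x - 1.0) / n
    return sum(h / f(1.0 + (k + 0.5) * h) for k in range(n))


def H_inverse(f, y, hi=2.0, tol=1e-9, n=2000):
    """Invert the increasing function H_f by bisection.

    The upper end of the search bracket is doubled until H_f(hi) >= y.
    """
    while H(f, hi, n) < y:
        hi *= 2.0
    lo = 1.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if H(f, mid, n) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with `f = math.sqrt` one has the closed form $H_f(x) = 2(\sqrt{x} - 1)$, so `H(math.sqrt, 9.0)` should be close to 4 and `H_inverse(math.sqrt, 4.0)` close to 9.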

Theorem 2.1. Suppose there exist a measurable function $V\colon E \to [0; \infty)$ and a metric $d$
on $E$ such that the following conditions hold:

1) $V$ is a Lyapunov function, i.e., there exist a concave differentiable function $\varphi\colon \mathbb{R}_+ \to \mathbb{R}_+$
increasing to infinity with $\varphi(0) = 0$ and a constant $K \ge 0$ such that

$$PV \le V - \varphi \circ V + K. \tag{2.1}$$

2) The space $(E, d)$ is a complete separable metric space.

3) The metric $d$ is contracting and bounded by 1, i.e., for any $x, y \in E$

$$W_d(P(x, \cdot), P(y, \cdot)) \le d(x, y) \le 1. \tag{2.2}$$

4) The level set $L := \{x, y \in E : V(x) + V(y) \le R\}$ is d-small for some $R > \varphi^{-1}(2K)$,
i.e., there exists $\rho > 0$ such that

$$W_d(P(x, \cdot), P(y, \cdot)) \le (1 - \rho)\,d(x, y)$$

for any $x, y \in L$.

Then the process $X$ has a unique stationary measure $\pi$ and

$$\int_E \varphi(V(u))\,\pi(du) \le K.$$

Moreover, for any $\varepsilon > 0$ there exist constants $C_1$ and $C_2$ such that for all $x \in E$

Remark 2.2. (i) If $\varphi$ is a linear function, then the rate of convergence is exponential and
this case is covered by [11, Theorem 4.8].

(ii) If $d(x, y) = \mathbf{1}(x \ne y)$, then the Wasserstein metric coincides with the total variation
metric and this case is covered by [5, Proposition 2.5].

Remark 2.3. Conditions 3 and 4 of the theorem are a bit more general than the
corresponding conditions from [11, Theorem 4.8]. Namely, we do not assume here that
$W_d(P(x, \cdot), P(y, \cdot)) \le (1 - \rho)\,d(x, y)$ for all $x, y \in E$ such that $d(x, y) < 1$. We suppose
that this inequality is satisfied only for $x, y$ belonging to the sublevel set.

Note that if $\varphi$ grows to infinity not very rapidly (as $x^\gamma$ for some $0 < \gamma < 1$ or slower),
then the estimate of the convergence rate given by (2.3) can be as close as possible to the
estimate of the convergence rate in the total variation distance obtained in [5, Proposition
2.5]. Specific examples of convergence rates (polynomial, logarithmic, etc.) for different
functions $\varphi$ are given in [5, Section 2.3].

While the proof of the theorem is postponed to Section 4, we outline now the main
steps.

Sketch of the proof of Theorem 2.1. To prove the theorem we develop the idea of
constructing an auxiliary contracting semimetric ([3 [101 E]). Namely, let $l$ be a semimetric
on the space $E$ such that $d(x, y) \le l(x, y)$ for all $x, y \in E$. It is possible to prove (for
some "good" $l$) that for any probability measures $\mu, \nu \in \mathcal{P}_{\varphi \circ V}(E)$

$$W_l(P\mu, P\nu) \le (1 - \chi(\mu, \nu))\,W_l(\mu, \nu),$$

where $\chi$ is a positive function (this is done in Lemma 4.3). Hence

$$W_d(P^n\mu, P^n\nu) \le W_l(P^n\mu, P^n\nu) \le \prod_{i=0}^{n-1} \bigl(1 - \chi(P^i\mu, P^i\nu)\bigr)\,W_l(\mu, \nu).$$

Of course, since we want to obtain subgeometric estimates of $W_d(P^n\mu, P^n\nu)$, there is
no hope that $\inf_{\mu, \nu \in \mathcal{P}_{\varphi \circ V}(E)} \chi(\mu, \nu)$ is positive (this lower bound was greater than zero in
[HUnilll], where geometric estimates were obtained). Yet, a good (albeit non-uniform)
estimate of $\chi(P^{i+1}\mu, P^{i+1}\nu)$ can be derived. However, this estimate depends not only
on $W_l(P^i\mu, P^i\nu)$ but also on $\mu(P^i(\varphi \circ V))$ and $\nu(P^i(\varphi \circ V))$. The latter two expressions
are unbounded if $\mu, \nu$ are fixed and $i$ runs over positive integers. Fortunately, there are
sufficiently many integers $i$ such that these two expressions are "small" (Lemma 4.1).
This allows us to overcome this obstacle (Lemma 4.4) and obtain subgeometric bounds on
$W_d(P^n\mu, P^n\nu)$. The last step is to prove the existence and uniqueness of the stationary
measure (Lemma 4.5). □

Now we give a similar result for continuous time Markov processes. Let $X = (X_t)_{t \ge 0}$
be a time-homogeneous strong Markov process and let $(P^t)_{t \ge 0}$ be the associated Markov
semigroup. Recall ([6, Theorem 2]) that if a Markov process has càdlàg paths, then the
strong Markov property is implied by the Feller property.

Theorem 2.4. Suppose there exist a measurable function $V\colon E \to [0; \infty)$ and a metric $d$
on $E$ such that the following conditions hold:

1) $V$ is a Lyapunov function, i.e., there exist a concave differentiable function $\varphi\colon \mathbb{R}_+ \to \mathbb{R}_+$
increasing to infinity with $\varphi(0) = 0$ and a constant $K \ge 0$ such that for all $t \ge 0$,
$x \in E$

$$\mathsf{E}_x V(X_t) \le V(x) - \mathsf{E}_x \int_0^t \varphi(V(X_u))\,du + Kt. \tag{2.4}$$

2) The space $(E, d)$ is a complete separable metric space.

3) The metric $d$ is bounded by 1 and contracting for all $t \ge t_0$, for some $t_0 \ge 0$, i.e.,
for any $x, y \in E$

$$W_d(P^t(x, \cdot), P^t(y, \cdot)) \le d(x, y) \le 1.$$

4) The level set $L := \{x, y \in E : V(x) + V(y) \le R\}$ is d-small for all $R > 0$ and all
$t \ge t_0$, i.e., there exists $\rho = \rho(R, t) > 0$ such that

$$W_d(P^t(x, \cdot), P^t(y, \cdot)) \le (1 - \rho)\,d(x, y)$$

for any $x, y \in L$.

Then the process $X$ has a unique stationary measure $\pi$ and $\pi(\varphi \circ V) \le K$. Moreover,
for any $\varepsilon > 0$ there exist constants $C_1$ and $C_2$ such that for all $x \in E$

Remark 2.5. (i) The linear case $\varphi(x) = Ax$, $A > 0$, is [11, Theorem 4.8].

(ii) The case where the metric $d$ is discrete, i.e., $d(x, y) = \mathbf{1}(x \ne y)$, is [4, Theorem
3.2].

Remark 2.6. (i) Condition 1 of Theorem 2.4 is equivalent to the Douc-Fort-Guillin
supermartingale condition [4, Formula (3.2)], i.e., inequality (2.4) holds if and only if the
process $Z := (Z_t)_{t \ge 0}$,

$$Z_t := V(X_t) + \int_0^t \varphi(V(X_u))\,du - Kt, \quad t \ge 0,$$

is a supermartingale with respect to the natural filtration of the process $X$.

(ii) Let $\mathcal{L}$ be the generator of the Markov process $X$. If the function $V$ belongs to the
domain of $\mathcal{L}$ and

$$\mathcal{L}V \le -\varphi \circ V + K,$$

where $K > 0$ and $\varphi\colon \mathbb{R}_+ \to \mathbb{R}_+$ is a concave differentiable function increasing to infinity
with $\varphi(0) = 0$, then condition 1 of Theorem 2.4 holds.

The proof of this theorem is given in Section 4. Let us describe here the main idea. 

Sketch of the proof of Theorem 2.4. Combining the techniques from [4, 7, 15], we find a
function $W\colon E \to [0; \infty)$ such that

$$P^{t_0} W(x) \le W(x) - \varphi(K_1 W(x)) + K_2, \quad x \in E,$$

for some positive $K_1$, $K_2$. Therefore, by Theorem 2.1, the skeleton chain $(X_{nt_0})_{n \in \mathbb{Z}_+}$ has
a unique invariant measure. It is possible to prove that this measure is also invariant for
the Markov process $X$ and that inequality (2.5) holds. □

Thus Theorems 2.1 and 2.4 suggest a new method for proving results concerning
subgeometric convergence. Namely, one needs to find a suitable contracting metric $d$
and a suitable Lyapunov function $V$ with d-small sublevel sets, such that the conditions
of the theorems hold. This extends the existing methods by allowing one to choose
the metric $d$ (which might be different from the discrete metric).

3. Examples and applications 

Let us give some applications of the results of the previous section. The focus here is on
stochastic delay equations; however, it is possible to apply results of this kind to study
convergence in the Wasserstein metric for other classes of Markov processes (see, e.g., [11,
Section 5.3] for estimates of convergence rates of stochastic partial differential equations).

We first recall some terminology from [16]. A Markov chain $X = (X_n)_{n \in \mathbb{Z}_+}$ is said to
be $\psi$-irreducible if there exists a nontrivial measure $\psi$ on $\mathcal{B}(E)$ such that for any $x \in E$
and any set $A \in \mathcal{B}(E)$ with $\psi(A) > 0$ one has $\mathsf{P}_x(T_A < \infty) > 0$, where $T_A$ is the first
return time to the set $A$, i.e., $T_A := \inf\{n \ge 1 : X_n \in A\}$.

A set $H \in \mathcal{B}(E)$ is called absorbing if $P(x, H) = 1$ for all $x \in H$, and Harris if there exists
a measure $\psi$ on $\mathcal{B}(E)$ with $\psi(H) > 0$ such that for any $x \in H$ and any set $A \in \mathcal{B}(E)$
with $\psi(A) > 0$ one has $\mathsf{P}_x(T_A < \infty) = 1$.

An invariant measure $\pi$ is called singular if for any $x \in E$ there exists an absorbing
set $S_x$ such that $x \in S_x$ and $\pi(S_x) = 0$. In other words, the Markov chain, whatever the
starting point is, will remain in a set of $\pi$-measure 0.

3.1 Autoregressive model 

Consider the following peculiar AR(1) process, which belongs to case (b). 

Example 3.1. Let $X = (X_n)_{n \in \mathbb{Z}_+}$ be an autoregressive process satisfying the following
equation:

$$X_{n+1} = \frac{X_n}{10} + \varepsilon_{n+1}, \quad n \in \mathbb{Z}_+,$$

where $\varepsilon_1, \varepsilon_2, \ldots$ are i.i.d. random variables uniformly distributed on the set
$\{0, \frac{1}{10}, \ldots, \frac{9}{10}\}$ and $X_0 \in [0; 1)$. In other words, to get $X_{n+1}$ from $X_n$ one needs to take the
decimal notation of $X_n$ (which starts with 0 followed by the decimal point) and insert a random
digit immediately after the decimal point. Other digits in the decimal notation of $X_n$ are
shifted right by one position.

Clearly, $X$ is a Markov process with state space $(E, \mathcal{E}) = ([0; 1), \mathcal{B}([0; 1)))$. Let $d$ be
the Euclidean metric on this space (i.e., $d(x, y) = |x - y|$, $x, y \in E$). One can easily prove
that the process $X$ has a unique invariant measure $\pi$, which is uniformly distributed on
the interval $[0; 1)$. Moreover, the sequence $\{X_n\}$ converges weakly to $\pi$ as $n \to \infty$.

This autoregression has a number of very interesting and unusual features. First, it
has a reconstruction property. Namely, if we have just one observation of $X_n$, where
the integer $n$ can be arbitrarily large, then it is possible to find the initial value $X_0$ with
probability 1 by the following simple formula: $X_0 = \{10^n X_n\}$, where $\{b\}$ denotes the
fractional part of a real $b$. In other words, one just needs to shift the decimal point right
by $n$ positions and drop all the digits to the left of the decimal point.

Therefore for $x, y \in E$, $x \ne y$, the probability measures $P(x, \cdot)$ and $P(y, \cdot)$ are singular.
Hence the process $X$ has no nontrivial small sets. On the other hand, the whole state
space $E$ is d-small. Indeed, it is easily seen that $W_d(P(x, \cdot), P(y, \cdot)) \le |x - y|/10$ for any
$x, y \in E$.

Observe also that the process $X$ is not $\psi$-irreducible, and, furthermore, it has uncountably
many pairwise disjoint absorbing sets. Indeed, it is sufficient to note that for any
$x \in E$ the set $S_x := \{y \in E \mid \exists\, m, n \in \mathbb{Z}_+ : \{10^m y\} = \{10^n x\}\}$ is absorbing and countable,
and for $x, y \in E$ either $S_x = S_y$ or $S_x \cap S_y = \emptyset$. By the same argument, the chain $X$ has
no Harris sets. Since $\pi(S_x) = 0$, we see that the measure $\pi$ is singular.

Finally, let us point out that for any $x \in E$, the sequence $P^n(x, \cdot)$ does not converge
to $\pi$ in total variation (moreover, $d_{TV}(P^n(x, \cdot), \pi) = 2$ for any positive integer $n$). On the
other hand, $P^n(x, \cdot)$ converges exponentially to $\pi$ in the Wasserstein metric (moreover,
$W_d(P^n(x, \cdot), \pi) \le 10^{-n}$ for any positive integer $n$).
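
Both the reconstruction property and the contraction of the Euclidean distance can be checked with exact rational arithmetic. The sketch below assumes the recursion "divide by 10 and add a uniformly chosen multiple of 1/10" described in words above; the helper names `step`, `run` and `reconstruct` are hypothetical. Driving two copies of the chain with the same digits is one particular (synchronous) coupling, so the distance it produces is an upper bound for the Wasserstein distance.

```python
from fractions import Fraction
import random


def step(x, digit):
    """One transition: insert `digit` right after the decimal point of x."""
    return x / 10 + Fraction(digit, 10)


def run(x0, digits):
    """Apply the transitions driven by the given sequence of digits."""
    x = x0
    for d in digits:
        x = step(x, d)
    return x


def reconstruct(x_n, n):
    """Recover X_0 as the fractional part of 10^n * X_n."""
    y = x_n * 10 ** n
    return y - (y.numerator // y.denominator)


rng = random.Random(1)
digits = [rng.randrange(10) for _ in range(8)]
x0, y0 = Fraction(1, 3), Fraction(5, 7)
# Synchronous coupling: both chains receive the same inserted digits.
xn, yn = run(x0, digits), run(y0, digits)
```

Here `reconstruct(xn, 8)` returns `x0` exactly, while `abs(xn - yn)` equals `abs(x0 - y0) / 10**8`: the two laws stay mutually singular, yet the coupled chains approach each other geometrically.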



3.2 Stochastic delay equations 

In this subsection we present our results on convergence of SDDEs in the Wasserstein 
metric. 

Fix $r > 0$, positive integers $n$, $m$, and let $\mathcal{C} = C([-r; 0], \mathbb{R}^n)$ be the space of continuous
functions from $[-r; 0]$ to $\mathbb{R}^n$ equipped with the supremum norm $\|\cdot\|$. Following [11],
introduce the following family of metrics on the space $\mathcal{C}$:

$$d_\beta(x, y) := 1 \wedge \|x - y\|/\beta, \quad \beta > 0.$$
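
As a small illustration, the truncated metric can be evaluated for paths sampled on a common grid. This is only a sketch under simplifying assumptions: the paths are real-valued (n = 1), they are represented by lists of samples, and the supremum norm is replaced by the maximum over the grid; the name `d_beta` is hypothetical.

```python
def d_beta(x, y, beta):
    """min(1, ||x - y|| / beta), with the sup norm taken over grid samples."""
    sup_norm = max(abs(a - b) for a, b in zip(x, y))
    return min(1.0, sup_norm / beta)


x = [0.0, 0.0]
y = [0.5, 0.2]
```

With these samples `d_beta(x, y, 1.0)` is 0.5, while `d_beta(x, y, 0.1)` is already 1.0: as beta decreases, the metric of any two distinct paths saturates at 1, so it approaches the discrete metric.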

Consider the stochastic differential delay equation

$$dX(t) = f(X_t)\,dt + g(X_t)\,dW(t), \quad t \ge 0, \qquad X_0 = x, \tag{3.1}$$

where $f\colon \mathcal{C} \to \mathbb{R}^n$, $g\colon \mathcal{C} \to \mathbb{R}^{n \times m}$, $W$ is an $m$-dimensional Brownian motion, $x \in \mathcal{C}$ is the
initial condition, and as usual we use the notation $X_t(s) := X(t + s)$, $-r \le s \le 0$. It is
clear that the process $X = (X_t)_{t \ge 0}$ defined on the state space $(\mathcal{C}, \mathcal{B}(\mathcal{C}))$ is Markov.

Throughout this section we assume that the drift and the diffusion satisfy the following 
conditions: 



• the drift satisfies a one-sided Lipschitz condition and the diffusion is Lipschitz, i.e.,
there exists $K > 0$ such that for any $x, y \in \mathcal{C}$

$$2\langle f(x) - f(y), x(0) - y(0)\rangle^+ + |||g(x) - g(y)|||^2 \le K\|x - y\|^2; \tag{3.2}$$

• the diffusion is nondegenerate, i.e., for any $x \in \mathcal{C}$ the matrix $g(x)$ admits a right
inverse $g^{-1}(x)$ and

$$\sup_{x \in \mathcal{C}} |||g^{-1}(x)||| < \infty; \tag{3.3}$$

• $f$ is continuous and bounded on bounded subsets of $\mathcal{C}$. (3.4)

Here $\langle\cdot, \cdot\rangle$ is the standard scalar product in $\mathbb{R}^n$; for a real $b$ we write $b^+ := \max(b, 0)$, and
$|||M|||$ denotes the Frobenius norm of a matrix $M$, i.e., $|||M|||^2 = \sum_{i,j} M_{ij}^2$. As in we also
define 

A+ = sup {g{v)g (u)r-r, j-r), A = sup - 



n 
Conditions (3.2) and (3.4) imply ([22]) the existence and uniqueness of the strong
solution of SDDE (3.1).

Now we give a general theorem, which describes convergence rates in the Wasserstein
metric $W_{d_\beta}$. Theorem 3.2(i) is a generalization of [11, Assumption 5.1].

Theorem 3.2. Suppose conditions (3.2)-(3.4) hold and there exists a Lyapunov function
$V\colon \mathcal{C} \to \mathbb{R}_+$ that satisfies inequality (2.4). If either

(i) $\lim_{\|x\| \to \infty} V(x) = \infty$;

or

(ii) $V(x) = U(x(0))$ for some function $U\colon \mathbb{R}^n \to \mathbb{R}_+$, $\lim_{|v| \to \infty} U(v) = \infty$, the diffusion
coefficient is uniformly bounded and the drift coefficient can be decomposed into two
terms

$$f(x) = f_1(x) + f_2(x(0)), \tag{3.5}$$

where the function $f_1$ is bounded;

then SDDE (3.1) has a unique invariant measure $\pi$. Furthermore, for any $\beta > 0$, the rate
of convergence of Law$(X_t)$ to $\pi$ in the Wasserstein metric $W_{d_\beta}$ is given by (2.5).

Proof. Fix $\beta > 0$. Let us check that the process $X$ and the function $V$ satisfy the
conditions of Theorem 2.4. It follows from [11, Proposition 5.4] and [2U Lemma 3.7.2]
that the process $X$ is Feller. Since $X$ has continuous paths, we see that $X$ is strongly
Markovian. The first condition of the theorem is satisfied by assumption. The second
condition also holds. In case (i) it follows directly from [11, Section 5.2] that there exists
a $\gamma \in (0; \beta)$ such that the third and the fourth conditions are met. In case (ii), arguing as
in [11, Proposition 5.3 and Lemma 3.8] one can show that the set $\{x \in \mathcal{C} : |x(0)| \le R\}$,
$R \ge 0$, is $d_\gamma$-small for some $\gamma \in (0; \beta)$ and the metric $d_\gamma$ is contracting. Thus, in both
cases the conditions of Theorem 2.4 are satisfied.

Apply Theorem 2.4 to the process $X$. It follows from this theorem that SDDE (3.1)
has a unique invariant measure $\pi$ and the rate of convergence of Law$(X_t)$ to $\pi$ in the
metric $W_{d_\gamma}$ is provided in (2.5). To conclude the proof, it remains to note that for any
measures $\mu_1, \mu_2 \in \mathcal{P}(E)$ one has $W_{d_\beta}(\mu_1, \mu_2) \le W_{d_\gamma}(\mu_1, \mu_2)$. □

Ergodic properties of stochastic differential equations (SDEs) were studied by Veretennikov
[25, 26], Malyshkin [15], Klokov [14], Douc, Fort, Guillin [4], and many others. It is
known that the Veretennikov-Khasminskii condition on the drift combined with a certain
non-degeneracy condition on the diffusion is sufficient for the existence and uniqueness of
the invariant measure for the strong solution of an SDE. Moreover, these conditions yield
exponential, subexponential or polynomial (depending on the value of the constant $\alpha$, see
below) convergence towards the invariant measure in the total variation metric ([JjJJ H]).
The following theorem extends these results to SDDEs.

Theorem 3.3. Suppose conditions (3.2)-(3.4) hold, $\Lambda < \infty$, and the function $f_1$ in
decomposition (3.5) is bounded.

(i) Assume additionally that for some constants $\alpha \in (0, 1]$, $M > 0$, $r > 0$ the generalized
Veretennikov-Khasminskii condition holds, i.e.,

$$\langle f(x), x(0)\rangle \le -r|x(0)|^\alpha, \quad x \in \mathcal{C},\ |x(0)| \ge M. \tag{3.6}$$

Then SDDE (3.1) has a unique invariant measure $\pi$ and Law$(X_t)$ converges to $\pi$ in
the Wasserstein metric $W_{d_\beta}$ subexponentially (if $0 < \alpha < 1$) or exponentially (if $\alpha = 1$),
i.e., for any $\beta > 0$ there exist positive constants $C_1$ and $C_2$ such that

$$W_{d_\beta}(P^t(x, \cdot), \pi) \le C_1 \exp\{C_1\|x\|^\alpha - C_2 t^{\alpha/(2 - \alpha)}\}, \quad x \in \mathcal{C},\ t > 0. \tag{3.7}$$

(ii) If (3.6) holds with $\alpha = 0$ and $r > n\Lambda/2$, then SDDE (3.1) has a unique invariant
measure $\pi$, but Law$(X_t)$ converges to $\pi$ in the Wasserstein metric $W_{d_\beta}$ only polynomially,
i.e., for any $\beta > 0$, $\varepsilon > 0$ there exists $C > 0$ such that

$$W_{d_\beta}(P^t(x, \cdot), \pi) \le C(1 + \|x\|^{2 + 2r_0})\,t^{-r_0 + \varepsilon}, \quad x \in \mathcal{C},\ t > 0,$$

where $r_0 = (r - n\Lambda/2)/\lambda_+$.

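Proof. The proof is based on the application of Theorem 3.2(ii) with a suitable Lyapunov
function $V$. (i) Following jTH Section 3] (see also @J Proposition 5.2]), let $U\colon \mathbb{R}^n \to [0; \infty)$
be a twice continuously differentiable function such that $U(v) = \exp\{k|v|^\alpha\}$ for $|v| \ge M_0$.
The parameters $M_0 \ge M$ and $k \ge 0$ will be chosen later. Take $V(x) = U(x(0))$. By Itô's
lemma, for any $x \in \mathcal{C}$ and $t > 0$ one has

$$\begin{aligned}
\mathsf{E}_x V(X_t) &\le V(x) + \alpha k\,\mathsf{E}_x \int_0^t \mathbf{1}(|X(s)| \ge M_0)\,V(X_s)\,|X(s)|^{\alpha - 2}\,\langle X(s), f(X_s)\rangle\,ds \\
&\quad + \tfrac{1}{2}\alpha k\,\mathsf{E}_x \int_0^t \mathbf{1}(|X(s)| \ge M_0)\,V(X_s)\,|X(s)|^{\alpha - 2}\,\bigl(\lambda_+ \alpha k |X(s)|^\alpha + C_1\bigr)\,ds + C_2 t \\
&\le V(x) - C_3 \alpha k\,\mathsf{E}_x \int_0^t \mathbf{1}(|X(s)| \ge M_0)\,V(X_s)\,|X(s)|^{2\alpha - 2}\,ds + C_2 t,
\end{aligned}$$

where $C_1 = \lambda_+(\alpha - 2) + n\Lambda$, $C_2 > 0$, $C_3 = r - \tfrac{1}{2}\lambda_+ \alpha k - \tfrac{1}{2} C_1 M_0^{-\alpha}$, and in the second
inequality we made use of (3.6).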

Let ip: M + — > K + be a concave differentiable function with (p(0) = and (p(t) = 
t(\nt) {2a - 2 ^ a for t ^ e 2 . Take fc = and M = (^) 1/q V(§) 1/q VM. Then U(M ) ^ e 2 
and 

$$\begin{aligned}
\mathsf{E}_x V(X_t) &\le V(x) - C_4\,\mathsf{E}_x \int_0^t \mathbf{1}(|X(s)| \ge M_0)\,V(X_s)\,|X(s)|^{2\alpha - 2}\,ds + C_2 t \\
&= V(x) - C_5\,\mathsf{E}_x \int_0^t \mathbf{1}(|X(s)| \ge M_0)\,\varphi(V(X_s))\,ds + C_2 t \\
&\le V(x) - C_5\,\mathsf{E}_x \int_0^t \varphi(V(X_s))\,ds + C_6 t,
\end{aligned}$$

where $C_4 := \alpha k r/4$, $C_5 := C_4 k^{2/\alpha - 2}$ and $C_6 > 0$. Thus the function $V$ satisfies
inequality (2.4). Theorem 3.2(ii) now yields the existence and uniqueness of the invariant
measure $\pi$ and implies estimate (3.7).

(ii) Now let $U(v) = |v|^k$, where $k > 2$. We take $V(x) = U(x(0))$ and proceed as
follows:

$$\begin{aligned}
\mathsf{E}_x V(X_t) &\le V(x) + \tfrac{1}{2} k\,\mathsf{E}_x \int_0^t |X(s)|^{k - 2}\,\bigl(2\langle X(s), f(X_s)\rangle + (k - 2)\lambda_+ + n\Lambda\bigr)\,ds \\
&\le V(x) - k C_1\,\mathsf{E}_x \int_0^t \mathbf{1}(|X(s)| \ge M)\,|X(s)|^{k - 2}\,ds + C_2 t,
\end{aligned}$$

where $C_1 = r - \frac{(k - 2)\lambda_+}{2} - \frac{n\Lambda}{2}$, $C_2 > 0$. Set

$$k = 2 + \frac{2r - n\Lambda}{\lambda_+} - \varepsilon,$$

where $\varepsilon > 0$. By choosing $\varepsilon > 0$ small enough we can ensure that $k > 2$. Take
$\varphi(u) = u^{(k - 2)/k}$. Then

$$\mathsf{E}_x V(X_t) \le V(x) - C_3\,\mathsf{E}_x \int_0^t \varphi(V(X_s))\,ds + C_4 t$$

for some $C_3, C_4 > 0$. Thus the function $V$ satisfies condition (2.4), and the statement of
the theorem now follows from Theorem 3.2(ii). □

Example 3.4. Consider the following peculiar SDDE 

$$dX(t) = f(X(t))\,dt + g(X(t - 1))\,dW(t),$$

where $n = m = 1$, the functions $f$ and $g$ satisfy (3.2)-(3.4), $f$ also satisfies (3.6), and
$g$ is a strictly increasing bounded positive continuous function. The strong solution of
this SDDE also belongs to case (b). This SDDE has the reconstruction property ([23]),
i.e., if we know $X_t$ for any $t > 0$, then we can reconstruct the initial condition $X_0$ with
probability one. Hence, the measures $P^t(x, \cdot)$ and $P^t(y, \cdot)$ are singular for any
$t > 0$ and $x \ne y$. It follows from Theorem 3.3 that this SDDE has a unique invariant
measure $\pi$. However, the reconstruction property implies that $d_{TV}(P^t(x, \cdot), \pi)$ does not
converge to $0$ as $t \to \infty$ and the measure $\pi$ is singular. On the other hand, if we replace
the total variation metric $d_{TV}$ by the Wasserstein metric $W_{d_\beta}$ (these two metrics can be
arbitrarily close to each other for sufficiently small $\beta$), then we see that $W_{d_\beta}(P^t(x, \cdot), \pi)$
converges to $0$ subexponentially.



4. Proofs of the main results 

To prove Theorems 2.1 and 2.4 we introduce some notation. Consider a semimetric
$l(x, y) := d(x, y)^{1/p}\,(1 + \beta\varphi(V(x) + V(y)))^{1/q}$, where $\beta > 0$, $p, q > 1$ and $1/p + 1/q = 1$.
These parameters will be chosen later. We start with two auxiliary lemmas.

Lemma 4.1. Assume that a function $V\colon E \to [0; \infty)$ satisfies condition 1 of Theorem
2.1. Then for any $n \in \mathbb{Z}_+$

$$\sum_{i=0}^{n-1} P^i(\varphi \circ V) \le nK + V. \tag{4.1}$$

Furthermore, if a measure $\pi$ is invariant for the process $X$, then $\pi \in \mathcal{P}_{\varphi \circ V}(E)$ and
$\pi(\varphi \circ V) \le K$.

Proof. Let us rewrite (2.1) in the following form: $\varphi \circ V - K \le V - PV$. Applying the
operator $P^i$, $i \in \mathbb{Z}_+$, to both sides of this expression and summing the result over all
$0 \le i < n$, we get

$$\sum_{i=0}^{n-1} P^i(\varphi \circ V) - nK \le V - P^n V,$$

which proves (4.1).

To prove the second part of the lemma we combine the first part of the lemma with a
cut-off argument (see, e.g., [9, Proposition 4.24]). Fix $L > 0$. Then, for any non-negative
integer $i$, we have

$$\int_E ((\varphi \circ V)(x) \wedge L)\,\pi(dx) = \int_E P^i((\varphi \circ V) \wedge L)(x)\,\pi(dx) \le \int_E (P^i(\varphi \circ V)(x) \wedge L)\,\pi(dx).$$

Summing both sides of the above inequality over all $0 \le i < n$, we derive

$$\int_E ((\varphi \circ V)(x) \wedge L)\,\pi(dx) \le \int_E \Bigl(\Bigl(\frac{1}{n}\sum_{i=0}^{n-1} P^i(\varphi \circ V)(x)\Bigr) \wedge L\Bigr)\,\pi(dx).$$

This, combined with (4.1), yields

$$\int_E ((\varphi \circ V)(x) \wedge L)\,\pi(dx) \le \int_E \Bigl(\Bigl(K + \frac{V(x)}{n}\Bigr) \wedge L\Bigr)\,\pi(dx).$$

Lebesgue's dominated convergence theorem implies that the integral on the right-hand
side of the above inequality tends to $K \wedge L$ as $n \to \infty$. Thus

$$\int_E ((\varphi \circ V)(x) \wedge L)\,\pi(dx) \le K$$

and the second part of the lemma follows from Fatou's lemma. □
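
The telescoping bound (4.1) can be sanity-checked on a toy chain. The sketch below is purely illustrative: the finite state space, the reflected random walk, the choices V(x) = x and phi(v) = sqrt(v), and all function names are assumptions, and the constant K is simply computed as the smallest value for which the drift condition holds on this particular chain.

```python
import math

N = 30  # hypothetical finite state space {0, ..., N}


def apply_P(f):
    """(Pf)(x) for a reflected random walk: down w.p. 0.7, up w.p. 0.3."""
    return [0.7 * f[max(x - 1, 0)] + 0.3 * f[min(x + 1, N)] for x in range(N + 1)]


V = [float(x) for x in range(N + 1)]   # Lyapunov function V(x) = x
phi_V = [math.sqrt(v) for v in V]      # phi(v) = sqrt(v): concave, phi(0) = 0
PV = apply_P(V)
# Smallest K >= 0 making the drift condition PV <= V - phi(V) + K hold everywhere.
K = max(0.0, max(pv - v + ph for pv, v, ph in zip(PV, V, phi_V)))


def lemma_4_1_holds(n, tol=1e-9):
    """Check sum_{i=0}^{n-1} P^i(phi o V) <= n*K + V at every state."""
    total = [0.0] * (N + 1)
    current = phi_V[:]                 # P^0 (phi o V)
    for _ in range(n):
        total = [t + c for t, c in zip(total, current)]
        current = apply_P(current)
    return all(t <= n * K + v + tol for t, v in zip(total, V))
```

Since K is chosen so that the drift condition holds by construction, `lemma_4_1_holds(n)` should return `True` for every n; the check only confirms the summation argument and says nothing about how sharp the bound is.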
The following Lemma 4.2 is due to Petrov.

Lemma 4.2 ([21]). Let $a_0, a_1, \ldots$ be a sequence of positive numbers and assume that for
all $n \in \mathbb{Z}_+$ one has

$$a_{n+1} \le a_n(1 - \psi(a_n)), \quad 0 < a_0 \le 1,$$

where $\psi\colon [0; \infty) \to [0; 1]$ is a continuous increasing function with $\psi(0) = 0$ and $\psi(x) > 0$
for $x > 0$. Then

$$a_n \le g^{-1}(n) \tag{4.2}$$

for all $n \in \mathbb{Z}_+$, where

$$g(x) := \int_x^1 \frac{dt}{t\,\psi(t)}.$$

Proof. We see that the function $g^{-1}$ is well-defined. This follows from the fact that the
function $g$ is nonnegative, unbounded and strictly decreasing. Since $\psi$ is positive, we have
$a_{n+1} \le a_n$. By the mean value theorem, there exists $s \in [a_{n+1}; a_n]$ such that

$$g(a_{n+1}) - g(a_n) = g'(s)(a_{n+1} - a_n) = \frac{a_n - a_{n+1}}{s\,\psi(s)} \ge \frac{a_n\,\psi(a_n)}{s\,\psi(s)} \ge 1.$$

Hence $g(a_n) \ge n$ and $a_n \le g^{-1}(n)$. □
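
A quick numerical illustration of the lemma, under the assumption that $g(x)$ is the integral of $1/(t\psi(t))$ over $[x, 1]$: for the hypothetical choice $\psi(x) = x$ on $[0; 1]$ one gets $g(x) = 1/x - 1$ and hence $g^{-1}(n) = 1/(n + 1)$. The sketch iterates the extremal recursion (with equality) and compares it with this bound; the function name is illustrative.

```python
def petrov_bound_holds(a0=0.5, steps=1000):
    """Check a_n <= g^{-1}(n) = 1 / (n + 1) for a_{n+1} = a_n * (1 - a_n).

    This is the recursion of the lemma with psi(x) = x, taken with equality.
    """
    a = a0
    for n in range(steps + 1):
        if a > 1.0 / (n + 1):
            return False
        a = a * (1.0 - a)
    return True
```

For starting values in (0, 1), such as 0.5 or 0.9, the function returns `True`, in line with the 1/n decay predicted by the lemma for this choice of $\psi$.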
The next key lemma gives the estimate of the contraction rate in one step.

Lemma 4.3. Assume that the conditions of Theorem 2.1 hold. Then there exist $\beta = \beta(p, q)$
and positive $c_1(p, q)$, $c_2(p, q)$, $c_3(p, q)$ such that for any $\mu, \nu \in \mathcal{P}_{\varphi \circ V}(E)$ one has

WtPfrPu) ^ (l-c 1 Ac 2V '(^- 1 (cW l ( f i,u)-^W l (fi,u), 

where $c := c_3\,(\mu(\varphi \circ V) + \nu(\varphi \circ V))^p$ and the semimetric $l$ was introduced at the beginning
of this section.

Here, as usual, $a \wedge b = \min(a, b)$ and $a \vee b = \max(a, b)$ for real $a$, $b$. To simplify the
formulas, we will drop a pair of parentheses and write $1 - a \wedge b$ for $1 - (a \wedge b)$.

Proof. We start as in the proof of [11, Theorem 4.8], by observing that since $W_l$ is convex,
the Jensen inequality implies

$$W_l(P\mu, P\nu) \le \int_{E \times E} W_l(P(x, \cdot), P(y, \cdot))\,\alpha(dx, dy) \tag{4.3}$$

for any $\mu, \nu \in \mathcal{P}_{\varphi \circ V}(E)$ and any $\alpha \in \mathcal{C}(\mu, \nu)$. Applying the Hölder inequality
and the Jensen inequality for concave functions, we find that

$$\begin{aligned}
W_l(P(x, \cdot), P(y, \cdot)) &= \inf_\lambda \int_{E \times E} l(u, v)\,\lambda(du, dv) \\
&\le \inf_\lambda \Bigl(\int_{E \times E} d(u, v)\,\lambda(du, dv)\Bigr)^{1/p} \Bigl(1 + \beta \int_{E \times E} \varphi(V(u) + V(v))\,\lambda(du, dv)\Bigr)^{1/q} \\
&\le W_d(P(x, \cdot), P(y, \cdot))^{1/p}\,\bigl(1 + \beta\varphi(PV(x) + PV(y))\bigr)^{1/q},
\end{aligned} \tag{4.4}$$

where the infimum is taken over all measures $\lambda \in \mathcal{C}(P(x, \cdot), P(y, \cdot))$.

To estimate the right-hand side of the last inequality we consider three different cases.
Note once again that, contrary to the proof of [11, Theorem 4.8], it is impossible here to
obtain a non-trivial uniform bound for $W_l(P(x, \cdot), P(y, \cdot))/l(x, y)$.

Fix a large $M > R$.

Case 1. $V(x)+V(y)\le R$. In this case we proceed similarly to [H [11]. Using (4.4) and conditions 1 and 4 of the theorem, we obtain
$$W_l(P(x,\cdot),P(y,\cdot)) \le (1-\rho)^{1/p}\,d(x,y)^{1/p}\,\big(1+\beta\varphi(2K+R)\big)^{1/q}.$$
Setting
$$\beta := \frac{(1+\rho/(2-2\rho))^{q-1}-1}{\varphi(2K+R)}$$
we get
$$W_l(P(x,\cdot),P(y,\cdot)) \le (1-\rho/2)^{1/p}\,d(x,y)^{1/p} \le (1-\rho/2p)\,l(x,y).$$
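The algebra behind Case 1 can be checked numerically. The Python sketch below evaluates $(1-\rho)^{1/p}(1+\beta\varphi(2K+R))^{1/q}$ for the stated choice of $\beta$ and compares it with $(1-\rho/2)^{1/p}$ and with $1-\rho/(2p)$. The sample values of $\rho$, $p$ and the number `phi_val` standing for $\varphi(2K+R)$ are assumptions made only for this example.

```python
def beta(rho, q, phi_val):
    # beta := ((1 + rho / (2 - 2 rho))^(q - 1) - 1) / phi(2K + R);
    # phi_val is a positive number standing for phi(2K + R).
    return ((1.0 + rho / (2.0 - 2.0 * rho)) ** (q - 1.0) - 1.0) / phi_val


def case1_factor(rho, p, phi_val):
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
    b = beta(rho, q, phi_val)
    return (1.0 - rho) ** (1.0 / p) * (1.0 + b * phi_val) ** (1.0 / q)


checks = []
for rho in (0.1, 0.5, 0.9):
    for p in (1.5, 2.0, 4.0):
        factor = case1_factor(rho, p, phi_val=3.7)
        target = (1.0 - rho / 2.0) ** (1.0 / p)
        identity_ok = abs(factor - target) < 1e-10
        # Bernoulli-type inequality (1 - x)^(1/p) <= 1 - x/p for p >= 1.
        bernoulli_ok = target <= 1.0 - rho / (2.0 * p) + 1e-12
        checks.append((identity_ok, bernoulli_ok))
```

The identity holds because $(q-1)/q = 1/p$, so the factor $(1+\beta\varphi(2K+R))^{1/q}$ equals $\big((1-\rho/2)/(1-\rho)\big)^{1/p}$.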

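Case 2. $R<V(x)+V(y)\le M$. In this case we make use of (2.1) and the concavity of $\varphi$ to derive
$$\begin{aligned}
\varphi(PV(x)+PV(y)) &\le \varphi\big(V(x)+V(y)-\varphi(V(x))-\varphi(V(y))+2K\big)\\
&\le \varphi\big(V(x)+V(y)-\varphi(V(x)+V(y))+2K\big). \qquad (4.5)
\end{aligned}$$
Clearly, if $u\in(R;M]$ then again by the concavity of $\varphi$ we have
$$\varphi(u-\varphi(u)+2K) \le \varphi(u)\Big(1-(\varphi(u)-2K)\frac{\varphi'(u)}{\varphi(u)}\Big) \le \varphi(u)\big(1-\theta\varphi'(M)\big),$$
where $\theta := 1-2K/\varphi(R)$. This inequality, combined with (4.4), (4.5) and the contraction property (2.2), yields
$$\begin{aligned}
W_l(P(x,\cdot),P(y,\cdot)) &\le d(x,y)^{1/p}\big(1+\beta\varphi(PV(x)+PV(y))\big)^{1/q}\\
&\le d(x,y)^{1/p}\Big(1+\beta\varphi(V(x)+V(y))\big(1-\theta\varphi'(M)\big)\Big)^{1/q}.
\end{aligned}$$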



Case 3. $V(x)+V(y)>M$. This is the easiest situation because in this case we would like to derive a very weak estimate of $W_l(P(x,\cdot),P(y,\cdot))$. Combining (2.2), (4.4) and (4.5) we get
$$\begin{aligned}
W_l(P(x,\cdot),P(y,\cdot)) &\le d(x,y)^{1/p}\Big(1+\beta\varphi\big(V(x)+V(y)-\varphi(V(x)+V(y))+2K\big)\Big)^{1/q}\\
&\le d(x,y)^{1/p}\big(1+\beta\varphi(V(x)+V(y))\big)^{1/q}\\
&= l(x,y).
\end{aligned}$$

Now we return to the main line of the proof. Introduce

Cl = c x (p, q, R, K) := J^ R) y c 2 = p/2p. 

Note that the values of $c_1$ and $c_2$ depend neither on the choice of $M$ nor on the measures $\mu$ and $\nu$. We see from (4.3) and the above estimates of $W_l(P(x,\cdot),P(y,\cdot))$ that for all $M>R$ one has
$$\begin{aligned}
W_l(P\mu,P\nu) \le{}& \big(1-c_2\wedge c_1\varphi'(M)\big)\int_{E\times E} l(x,y)\,\alpha(dx,dy)\\
&+ \big(c_2\wedge c_1\varphi'(M)\big)\int_{\{V(x)+V(y)>M\}} l(x,y)\,\alpha(dx,dy). \qquad (4.6)
\end{aligned}$$
The second integral on the right-hand side of (4.6) is estimated using the Chebyshev inequality. Namely,
$$\begin{aligned}
\int_{\{V(x)+V(y)>M\}} l(x,y)\,\alpha(dx,dy) &\le \int_{\{V(x)+V(y)>M\}}\big(1+\beta\varphi(V(x)+V(y))\big)^{1/q}\,\alpha(dx,dy)\\
&\le C\int_{\{V(x)+V(y)>M\}}\varphi(V(x)+V(y))^{1/q}\,\alpha(dx,dy)\\
&\le C\varphi(M)^{-1/p}\int_{E\times E}\varphi(V(x)+V(y))\,\alpha(dx,dy)\\
&\le C\varphi(M)^{-1/p}\big(\mu(\varphi\circ V)+\nu(\varphi\circ V)\big),
\end{aligned}$$
where $C = 1/K+\beta+1$ and in the second inequality we used the bound $\varphi(M)>K$. Note that $\mu(\varphi\circ V)$ as well as $\nu(\varphi\circ V)$ are finite because it was assumed that $\mu,\nu\in\mathcal{P}_{\varphi\circ V}(E)$.

Recall that $\alpha$ is an arbitrary element of $\mathcal{C}(\mu,\nu)$. Hence we can take the infimum over all $\alpha\in\mathcal{C}(\mu,\nu)$ in (4.6) and use the above inequality to derive
$$\begin{aligned}
W_l(P\mu,P\nu) \le{}& \big(1-c_2\wedge c_1\varphi'(M)\big)\,W_l(\mu,\nu)\\
&+ C\big(c_2\wedge c_1\varphi'(M)\big)\big(\mu(\varphi\circ V)+\nu(\varphi\circ V)\big)\varphi(M)^{-1/p}. \qquad (4.7)
\end{aligned}$$
Now we can choose $M$ in such a way that the right-hand side of the above expression is always smaller than $W_l(\mu,\nu)$. Namely, it is sufficient to require that
$$C\big(\mu(\varphi\circ V)+\nu(\varphi\circ V)\big)\varphi(M)^{-1/p} \le W_l(\mu,\nu)/2.$$
This inequality holds for
$$M = \varphi^{-1}\Big(c_3\big(\mu(\varphi\circ V)+\nu(\varphi\circ V)\big)^p\,W_l(\mu,\nu)^{-p}\Big),$$
where $c_3 = c_3(p,q,R,K) = 2^p(1/K+\beta+1)^p$. The substitution of the last expression into (4.7) proves the lemma. □
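The choice of $M$ at the end of the proof can be verified on a concrete example. The Python sketch below takes the hypothetical function $\varphi(u)=\sqrt{u}$ (so that $\varphi^{-1}(y)=y^2$) and illustrative values for $K$, $\beta$, $p$, for the sum $S=\mu(\varphi\circ V)+\nu(\varphi\circ V)$ and for $W=W_l(\mu,\nu)$; all of these numbers are assumptions made only for this check. With $M=\varphi^{-1}(c_3S^pW^{-p})$ the quantity $CS\varphi(M)^{-1/p}$ should equal $W/2$.

```python
def phi(u):
    # Hypothetical concave increasing function with phi(0) = 0.
    return u ** 0.5


def phi_inv(y):
    # Inverse of phi(u) = sqrt(u).
    return y ** 2


def choose_M(K, beta, p, S, W):
    # C = 1/K + beta + 1 and c3 = 2^p C^p, as in the proof of the lemma.
    C = 1.0 / K + beta + 1.0
    c3 = (2.0 ** p) * (C ** p)
    M = phi_inv(c3 * (S ** p) * (W ** (-p)))
    return M, C


K, BETA, P, S, W = 1.0, 0.3, 2.0, 5.0, 0.2  # illustrative values only
M, C = choose_M(K, BETA, P, S, W)
# Left-hand side of the requirement C * S * phi(M)^(-1/p) <= W / 2.
residual = C * S * phi(M) ** (-1.0 / P)
```

For this choice the requirement holds with equality, which is why the second term in (4.7) is at most half of the contraction gain.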

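Lemma 4.4. Assume that the conditions of Theorem 2.1 are satisfied. Let $\mu,\nu\in\mathcal{P}_{\varphi\circ V}(E)$ and let $(n_k)_{k\in\mathbb{Z}_+}$ be an increasing sequence of positive integers such that for all $k\in\mathbb{Z}_+$
$$P^{n_k}\mu(\varphi\circ V)+P^{n_k}\nu(\varphi\circ V) \le C(\mu,\nu),$$
where $C(\mu,\nu)\ge 1$. Then there exist positive $C_1$, $C_2$ that do not depend on $\mu$, $\nu$ such that for all $k\in\mathbb{Z}_+$
$$W_l(P^{n_k}\mu,P^{n_k}\nu) \le \frac{C_1\,C(\mu,\nu)}{\varphi\big(H_\varphi^{-1}(C_2k)\big)^{1/p}}. \qquad (4.8)$$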

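Proof. We begin by observing that for any measures $\zeta_1,\zeta_2\in\mathcal{P}_{\varphi\circ V}(E)$ one has
$$\begin{aligned}
W_l(\zeta_1,\zeta_2) &\le \int_{E\times E}\big(1+\beta\varphi(V(x)+V(y))\big)^{1/q}\,\zeta_1(dx)\,\zeta_2(dy)\\
&\le \Big(1+\beta\int_{E\times E}\varphi(V(x)+V(y))\,\zeta_1(dx)\,\zeta_2(dy)\Big)^{1/q}\\
&\le \big(1+\beta\zeta_1(\varphi\circ V)+\beta\zeta_2(\varphi\circ V)\big)^{1/q},
\end{aligned}$$
where we used the concavity of the function $\varphi$ and the bound $d\le 1$. Hence,
$$\begin{aligned}
W_l(P^{n_0}\mu,P^{n_0}\nu) &\le \big(1+\beta P^{n_0}\mu(\varphi\circ V)+\beta P^{n_0}\nu(\varphi\circ V)\big)^{1/q}\\
&\le \big(1+\beta C(\mu,\nu)\big)^{1/q} \le (1+\beta)\,C(\mu,\nu). \qquad (4.9)
\end{aligned}$$
Introduce $C_0 := 1+\beta$ and denote
$$a_n := \frac{W_l(P^n\mu,P^n\nu)}{C_0\,C(\mu,\nu)}, \qquad n\in\mathbb{Z}_+.$$
It follows from Lemma 4.3 that $0\le a_{n+1}\le a_n$ for all $n\in\mathbb{Z}_+$. Besides, by definition and (4.9) we have $a_{n_0}\le 1$. The function $\varphi'$ is decreasing, therefore using Lemma 4.3 we derive
$$a_{n_{k+1}} \le a_{n_k+1} \le \Big(1-c_1\wedge c_2\,\varphi'\big(\varphi^{-1}(c_4a_{n_k}^{-p})\big)\Big)\,a_{n_k},$$
where $c_4 = C_0^{-p}c_3$. Since $a_{n_0}\le 1$, it is possible to apply Lemma 4.2 to the sequence $(a_{n_k})_{k\in\mathbb{Z}_+}$. It follows from (4.2) that $a_{n_k}\le g^{-1}(k)$, where
$$\begin{aligned}
g(x) &= \int_x^1\frac{dt}{c_1t\wedge c_2t\,\varphi'(\varphi^{-1}(c_4t^{-p}))} = c_5\int_x^{c_6}\frac{dt}{t\,\varphi'(\varphi^{-1}(c_4t^{-p}))}+c_7\\
&= c_8\int_{c_9}^{\varphi^{-1}(c_4x^{-p})}\frac{du}{\varphi(u)}+c_7 = c_8H_\varphi\big(\varphi^{-1}(c_4x^{-p})\big)+c_{10}
\end{aligned}$$
and $c_5, c_6, \ldots$ are some positive constants. Note that to obtain the third identity we made the change of variables $u=\varphi^{-1}(c_4t^{-p})$. Thus we finally get $a_{n_k}\le c_{11}\,\varphi(H_\varphi^{-1}(c_{12}k))^{-1/p}$ and hence
$$W_l(P^{n_k}\mu,P^{n_k}\nu) = C_0\,C(\mu,\nu)\,a_{n_k} \le c_{11}C_0\,C(\mu,\nu)\,\varphi\big(H_\varphi^{-1}(c_{12}k)\big)^{-1/p}.$$
This completes the proof of the lemma. □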
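To see what kind of rate the bound of Lemma 4.4 produces, consider the hypothetical power function $\varphi(u)=u^{\alpha}$ with $0<\alpha<1$, and assume that $H_\varphi(x)=\int_1^x du/\varphi(u)$, as is usual for subgeometric rates (this definition is not restated in this section, so it is an assumption of the sketch). Then $H_\varphi(x)=(x^{1-\alpha}-1)/(1-\alpha)$, and $\varphi(H_\varphi^{-1}(C_2k))^{-1/p}$ decays polynomially in $k$. The Python sketch below checks the closed forms against a midpoint quadrature; the values $\alpha=1/2$, $C_2=1$ and $p=2$ are illustrative.

```python
ALPHA = 0.5  # hypothetical exponent, 0 < ALPHA < 1


def phi(u):
    return u ** ALPHA


def H(x):
    # Closed form of int_1^x du / u^ALPHA.
    return (x ** (1.0 - ALPHA) - 1.0) / (1.0 - ALPHA)


def H_inv(y):
    # Inverse of H on [0, infinity).
    return (1.0 + (1.0 - ALPHA) * y) ** (1.0 / (1.0 - ALPHA))


def rate(k, C2=1.0, p=2.0):
    # The factor phi(H^{-1}(C2 k))^(-1/p) from the bound (4.8).
    return phi(H_inv(C2 * k)) ** (-1.0 / p)


def H_numeric(x, steps=100000):
    # Midpoint rule for int_1^x du / phi(u), used only as a cross-check.
    h = (x - 1.0) / steps
    return sum(h / phi(1.0 + (i + 0.5) * h) for i in range(steps))


quadrature_error = abs(H_numeric(9.0) - H(9.0))
rates = [rate(k) for k in range(0, 50)]
```

With these values `rate(k)` equals $(1+k/2)^{-1/2}$, so the sequence of rates decreases like $k^{-1/2}$ rather than geometrically.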

Lemma 4.5. Under the conditions of Theorem 2.1 the process $X$ has a unique stationary measure $\pi$.

As was pointed out by the referee, if we additionally assumed that the sublevel sets of $V$ are compact and the process $X$ is Feller, then the proof of the lemma would be trivial. Indeed, in this case the statement of the lemma would follow directly from the Krylov-Bogoliubov theorem, see [8, p. 20]. However, we do not make this assumption, because we would like to apply Theorem 2.1 to Markov processes with a non locally compact state space (in particular, to strong solutions of stochastic delay equations defined on $C([-r;0],\mathbb{R}^n)$, see Section 3.2).

Proof. First let us prove the existence of a stationary measure. Fix $x\in E$. Let us verify that the sequence of measures $(P^n\delta_x)_{n\in\mathbb{Z}_+}$ has a Cauchy subsequence. For $n<m\in\mathbb{Z}_+$ define:
$$A(n,m) := \#\{i\in[n;m) : P^i(\varphi\circ V)(x) \le 4K+4V(x)+1\},$$
$$B(n,m) := \#\{i\in[0;n) : \big(P^i(\varphi\circ V)(x)\vee P^{m-n+i}(\varphi\circ V)(x)\big) \le 4K+4V(x)+1\}.$$
Here the symbol $\#$ denotes the cardinality of a finite set. It follows from the above definitions that for $n<m$
$$B(n,m) \ge A(0,n)+A(m-n,m)-n. \qquad (4.10)$$
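Inequality (4.10) is an inclusion-exclusion count, with the second count taken over the shifted window $[m-n;m)$. The Python sketch below checks it on a randomly generated pattern of "good" indices; here `good[i]` is a stand-in for the event $P^i(\varphi\circ V)(x)\le 4K+4V(x)+1$, and the random pattern is purely an assumption for illustration.

```python
import random


def A(good, n, m):
    # Number of good indices i in [n, m).
    return sum(1 for i in range(n, m) if good[i])


def B(good, n, m):
    # Number of i in [0, n) such that both i and the shifted index m - n + i are good.
    return sum(1 for i in range(n) if good[i] and good[m - n + i])


random.seed(0)
good = [random.random() < 0.8 for _ in range(200)]

# Collect every pair (n, m) with n < m that would violate the inequality.
violations = [
    (n, m)
    for n in range(1, 60)
    for m in range(n + 1, 120)
    if B(good, n, m) < A(good, 0, n) + A(good, m - n, m) - n
]
```

The list `violations` stays empty for any pattern, because two subsets of an $n$-element set with $a$ and $b$ elements share at least $a+b-n$ elements.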

Introduce the following sequence. Let $r_{-1}=-1$ and for $k\in\mathbb{Z}_+$
$$r_k := \inf\{s>r_{k-1} : \big(P^s(\varphi\circ V)(x)\vee P^{m-n+s}(\varphi\circ V)(x)\big) \le 4K+4V(x)\}.$$
We see that $r_{B(n,m)-1}<n$. We apply Lemma 4.4 to the sequence $(r_k)_{k\in\mathbb{Z}_+}$, the measures $\delta_x$ and $P^{m-n}\delta_x$ and take $C(\delta_x,P^{m-n}\delta_x) = 4K+4V(x)+1$. Then, by (4.8),
$$\begin{aligned}
W_l(P^n\delta_x,P^m\delta_x) &\le W_l\big(P^{r_{B(n,m)-1}}\delta_x,\,P^{r_{B(n,m)-1}}(P^{m-n}\delta_x)\big)\\
&\le C_1(4K+4V(x)+1)\,\varphi\big(H_\varphi^{-1}(C_2B(n,m)-C_2)\big)^{-1/p}, \qquad (4.11)
\end{aligned}$$
where we used Lemma 4.3 to obtain the first inequality. Recall that the constants $C_1$, $C_2$ are independent of $n$, $m$.

It follows from (4.1) that for any fixed $n$ there exists an arbitrarily large $m$ such that $A(mn,(m+1)n)\ge 3n/4$. Since $A(0,n)\ge 3n/4$, inequality (4.10) implies that for any fixed $n$ there exists an arbitrarily large $m$ such that $B(n,m)\ge n/2$. It is clear that for all such $m$ one has
$$W_l(P^n\delta_x,P^m\delta_x) \le C_1(4K+4V(x)+1)\,\varphi\big(H_\varphi^{-1}(C_2n/2-C_2)\big)^{-1/p} =: \Psi(n).$$
It is evident that $\Psi(n)\to 0$ as $n\to\infty$.

Now we can construct the desired Cauchy subsequence. We set $n_0=0$ and for $k\in\mathbb{Z}_+$
$$n_{k+1} := \inf\{m>n_k : B(n_k,m)\ge n_k/2 \ \&\ \Psi(m)\le e^{-(k+1)}\}.$$
By the above arguments, we see that the sequence $(n_k)_{k\in\mathbb{Z}_+}$ is well-defined, $B(n_k,n_{k+1})\ge n_k/2$, and $\Psi(n_k)\le e^{-k}$. Now we claim that the sequence $(P^{n_k}\delta_x)_{k\in\mathbb{Z}_+}$ is a Cauchy sequence in the space $(\mathcal{P}(E),W_d)$. Indeed, using (4.11) and the definition of $n_k$ we derive
$$\begin{aligned}
W_d(P^{n_k}\delta_x,P^{n_{k+m}}\delta_x) &\le \sum_{i=k}^{k+m-1}W_d(P^{n_i}\delta_x,P^{n_{i+1}}\delta_x) \le \sum_{i=k}^{k+m-1}W_l(P^{n_i}\delta_x,P^{n_{i+1}}\delta_x)\\
&\le \sum_{i=k}^{k+m-1}\Psi(n_i) \le \sum_{i=k}^{k+m-1}e^{-i}
\end{aligned}$$
for all integers $k$, $m$. Since the space $(\mathcal{P}(E),W_d)$ is complete (see, e.g., [2, Theorem 1.1.3]), we see that there exists a measure $\pi\in\mathcal{P}(E)$ such that $W_d(P^{n_k}\delta_x,\pi)\to 0$.

Let us verify that the measure $\pi$ is stationary, i.e., let us check that $P\pi=\pi$. Note that the metric $W_d$ is contractive. Indeed, for any $\mu,\nu\in\mathcal{P}(E)$ we have
$$\begin{aligned}
W_d(P\mu,P\nu) &\le \inf_{\lambda\in\mathcal{C}(\mu,\nu)}\int_{E\times E}W_d(P(x,\cdot),P(y,\cdot))\,\lambda(dx,dy)\\
&\le \inf_{\lambda\in\mathcal{C}(\mu,\nu)}\int_{E\times E}d(x,y)\,\lambda(dx,dy)\\
&= W_d(\mu,\nu),
\end{aligned}$$
where we used the Jensen inequality and condition (2.2). Therefore for any $k\in\mathbb{Z}_+$ we obtain
$$\begin{aligned}
W_d(P\pi,\pi) &\le W_d(P\pi,P^{n_k+1}\delta_x)+W_d(P^{n_k}\delta_x,P^{n_k+1}\delta_x)+W_d(P^{n_k}\delta_x,\pi)\\
&\le 2W_d(\pi,P^{n_k}\delta_x)+W_l(P^{n_k}\delta_x,P^{n_k+1}\delta_x). \qquad (4.12)
\end{aligned}$$
The first term on the right-hand side of the last expression tends to $0$ as $k\to\infty$. To estimate the second term, we observe that if $n$ is a positive integer then $A(0,n)\ge 3n/4$ and $A(1,n+1)\ge 3n/4-1$. Therefore, inequality (4.10) implies $B(n,n+1)\ge n/2-1$. This, combined with (4.11), yields
$$W_l(P^{n_k}\delta_x,P^{n_k+1}\delta_x) \le \frac{C_1(4K+4V(x)+1)}{\varphi\big(H_\varphi^{-1}(C_2n_k/2-2C_2)\big)^{1/p}}.$$
Hence $W_l(P^{n_k}\delta_x,P^{n_k+1}\delta_x)\to 0$ as $k\to\infty$, and we conclude from (4.12) that $W_d(P\pi,\pi)=0$, which implies the stationarity of the measure $\pi$.

To complete the proof of the lemma it remains to prove the uniqueness of the stationary measure. Suppose that, on the contrary, the process $X$ has two stationary measures $\pi_1$ and $\pi_2$ and $\pi_1\ne\pi_2$. By Lemma 4.1, $\pi_1,\pi_2\in\mathcal{P}_{\varphi\circ V}(E)$ and hence $0<W_l(\pi_1,\pi_2)<\infty$. We make use of stationarity of the measures and Lemma 4.3 to obtain
$$W_l(\pi_1,\pi_2) = W_l(P\pi_1,P\pi_2) < W_l(\pi_1,\pi_2).$$
This contradiction proves the lemma. □

Proof of Theorem 2.1. It follows from Lemma 4.1 and Lemma 4.5 that the process $X$ has a unique stationary measure $\pi\in\mathcal{P}_{\varphi\circ V}(E)$ and $\pi(\varphi\circ V)\le K$. Fix $x\in E$ and consider the following sequence. Let $n_0=0$ and
$$n_{k+1} := \inf\{m>n_k : P^m(\varphi\circ V)(x) \le 2K+2V(x)+1\}, \qquad k\in\mathbb{Z}_+.$$
We make use of stationarity of $\pi$, the bound $\pi(\varphi\circ V)\le K$ and the definition of $n_k$ to derive
$$P^{n_k}\delta_x(\varphi\circ V)+P^{n_k}\pi(\varphi\circ V) = P^{n_k}\delta_x(\varphi\circ V)+\pi(\varphi\circ V) \le 3K+2V(x)+1.$$
Let us apply Lemma 4.4 to the measures $\delta_x$, $\pi$, to the sequence $(n_k)_{k\in\mathbb{Z}_+}$ and take $C(\delta_x,\pi) = 3K+2V(x)+1$. Clearly, $C(\delta_x,\pi)\ge 1$. It follows from (4.8) that
$$W_l(P^{n_k}\delta_x,\pi) \le \frac{C_1(3K+2V(x)+1)}{\varphi\big(H_\varphi^{-1}(C_2k)\big)^{1/p}}.$$

On the other hand, it follows from (4.1) that $n_k\le 2k$. To conclude the proof, it remains to take $1/p=1-\varepsilon$ and note that
$$W_d(P^{2k}\delta_x,\pi) \le W_l(P^{2k}\delta_x,\pi) = W_l(P^{2k}\delta_x,P^{2k-n_k}\pi) \le W_l(P^{n_k}\delta_x,\pi).$$

To switch from discrete time to continuous time and prove Theorem 2.4 we combine different methods from [U [7J [18]. First of all, for a set $C\in\mathcal{B}(E)$, introduce the hitting time delayed by $\delta>0$,
$$\tau_C(\delta) := \inf\{t\ge\delta : X_t\in C\},$$
and the hitting and return times of the skeleton chain:
$$\sigma_{m,C} := \inf\{n\in\mathbb{Z}_+ : X_{mn}\in C\};$$
$$T_{m,C} := \inf\{n\in\mathbb{Z}_+,\ n\ge 1 : X_{mn}\in C\},$$
where $m>0$. Denote for brevity $C_R := \{V(x)\le R\}$.



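Lemma 4.6. If $R>0$ and $\varphi(R)>K$, then under the conditions of Theorem 2.4
$$\mathsf{E}_x\tau_{C_R}(\delta) \le \frac{\delta\varphi(R)+V(x)}{\varphi(R)-K}$$
for all $x\in E$ and $\delta>0$.

Proof. Fix $L>\delta$. Observe that if $\delta\le u<\tau_{C_R}(\delta)$ then by definition $V(X_u)>R$. Combining this with (2.4) we obtain
$$\begin{aligned}
\mathsf{E}_x(\tau_{C_R}(\delta)\wedge L) &= \delta+\mathsf{E}_x\int_\delta^{\tau_{C_R}(\delta)\wedge L}du \le \delta+\frac{1}{\varphi(R)}\,\mathsf{E}_x\int_\delta^{\tau_{C_R}(\delta)\wedge L}\varphi(V(X_u))\,du\\
&\le \delta+\frac{V(x)+K\,\mathsf{E}_x(\tau_{C_R}(\delta)\wedge L)}{\varphi(R)}.
\end{aligned}$$
Therefore
$$\mathsf{E}_x(\tau_{C_R}(\delta)\wedge L) \le \frac{\delta\varphi(R)+V(x)}{\varphi(R)-K}.$$
The desired inequality follows now from the Fatou lemma. □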
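A deterministic toy example makes the bound of Lemma 4.6 concrete. The Python sketch below assumes a flow whose Lyapunov function satisfies $dV/dt = K-\varphi(V)$ with the hypothetical choice $\varphi(v)=\sqrt{v}$, so that $V(X_t)+\int_0^t\varphi(V(X_u))\,du-Kt$ is constant; whether this matches condition (2.4) exactly is an assumption, since (2.4) is not restated here. It integrates the flow with the explicit Euler scheme and compares the delayed hitting time of $\{V\le R\}$ with $(\delta\varphi(R)+V(x))/(\varphi(R)-K)$.

```python
import math


def phi(v):
    # Hypothetical concave increasing function with phi(0) = 0.
    return math.sqrt(v)


def delayed_hitting_time(v0, R, K, delta, dt=1e-3, t_max=1e3):
    # Explicit Euler scheme for dV/dt = K - phi(V); returns the first grid
    # time t >= delta with V <= R, or infinity if t_max is reached.
    t, v = 0.0, v0
    while t < t_max:
        if t >= delta and v <= R:
            return t
        v += dt * (K - phi(v))
        t += dt
    return float("inf")


V0, R, K, DELTA = 100.0, 9.0, 1.0, 0.5  # illustrative values, phi(R) = 3 > K
hit = delayed_hitting_time(V0, R, K, DELTA)
bound = (DELTA * phi(R) + V0) / (phi(R) - K)
# Exact hitting time of level R for this flow: int_9^100 dv / (sqrt(v) - 1).
exact = 2.0 * (7.0 + math.log(4.5))
```

In this example the hitting time is roughly 17 while the bound is 50.75, so the estimate is far from sharp here; it only needs to be linear in $V(x)$ for the later arguments.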
Lemma 4.7. Let $m>0$. If $R>Km$ and $\varphi(R-Km)>K$ then under the conditions of Theorem 2.4
$$\mathsf{E}_xT_{m,C_R} \le c_1V(x)+c_2, \qquad x\in E,$$
where $c_1=c_1(m,R,K)$ and $c_2=c_2(m,R,K)$ are positive functions that do not depend on $x$.

Proof. The proof of the lemma uses the ideas from the proof of [H Proposition 22(h)]. However, note that we cannot apply this proposition directly because, in contrast to Fort and Roberts, we assumed neither that the set $\{V(x)\le R\}$ is petite nor that the process $X$ is Harris recurrent with an invariant measure.

Introduce $R'<R-Km$ such that $\varphi(R')>K$. The existence of such $R'$ follows from the conditions of the lemma. Consider the following sequence of stopping times:
$$\tau^0 := 0, \qquad \tau^1 := \tau_{C_{R'}}(m), \qquad \tau^n := \inf\{t\ge\tau^{n-1}+m : X_t\in C_{R'}\},$$
and let $M := \sup_{x\in C_{R'}}\mathsf{E}_x\tau_{C_{R'}}(m)$. By Lemma 4.6,
$$M \le \frac{m\varphi(R')+R'}{\varphi(R')-K}.$$
For $n\in\mathbb{Z}_+$, $n\ge 1$ define $Z_n := I\{X_{\lceil\tau^n/m\rceil m}\in C_R\}$, where $\lceil b\rceil$ denotes the upper integer part of a real $b$. By definition, $Z_n\in\mathcal{F}_{\tau^{n+1}}$, where we denote $\mathcal{F}_t := \sigma\{X_s,\ 0\le s\le t\}$. We combine the strong Markov property, the Chebyshev inequality and (2.4) to obtain
$$\begin{aligned}
\mathsf{P}(Z_n=1\mid\mathcal{F}_{\tau^n}) &= 1-\mathsf{E}_{X_{\tau^n}}I\{X_{\lceil\tau^n/m\rceil m-\tau^n}\notin C_R\}\\
&\ge 1-\frac{V(X_{\tau^n})+Km}{R} \ge \frac{R-R'-Km}{R} =: \gamma. \qquad (4.13)
\end{aligned}$$
It follows from the choice of $R'$ that $\gamma>0$.

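Introduce $\eta := \inf\{n\in\mathbb{Z}_+,\ n\ge 1 : Z_n=1\}$. Using the strong Markov property, (4.13) and following the same lines as in the proof of [18, Lemma 3.1] we get for $n\ge 1$ and $x\in C_{R'}$
$$\begin{aligned}
\mathsf{E}_x\tau^nI(\eta\ge n) \le{}& \mathsf{E}_x\tau^{n-1}I(\eta\ge n-1)\,\mathsf{E}\big(I(Z_{n-1}=0)\mid\mathcal{F}_{\tau^{n-1}}\big)\\
&+\mathsf{E}_xI(\eta>n-1)\,\mathsf{E}\big(\tau^n-\tau^{n-1}\mid\mathcal{F}_{\tau^{n-1}}\big)\\
\le{}& (1-\gamma)\,\mathsf{E}_x\tau^{n-1}I(\eta\ge n-1)+(1-\gamma)^{n-1}M.
\end{aligned}$$
Since $\mathsf{E}_x\tau^0I(\eta\ge 0)$ is obviously zero, by induction we establish the following estimate:
$$\mathsf{E}_x\tau^nI(\eta\ge n) \le nM(1-\gamma)^{n-1}, \qquad x\in C_{R'}.$$
Thus we have
$$\mathsf{E}_x\tau^\eta \le \sum_{n=1}^{\infty}\mathsf{E}_x\tau^nI(\eta\ge n) \le \frac{M}{\gamma^2}, \qquad x\in C_{R'}.$$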
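The induction and the summation in the last two displays can be reproduced numerically. The Python sketch below runs the recursion $b_n=(1-\gamma)b_{n-1}+(1-\gamma)^{n-1}M$ with $b_0=0$ (the case of equality), compares $b_n$ with the closed form $nM(1-\gamma)^{n-1}$, and sums the terms to recover $M/\gamma^2$. The values $M=3$ and $\gamma=1/4$ are assumptions chosen only for illustration.

```python
def recursion_terms(M, gamma, count):
    # b_n = (1 - gamma) * b_{n-1} + (1 - gamma)^(n-1) * M with b_0 = 0.
    terms, b = [], 0.0
    for n in range(1, count + 1):
        b = (1.0 - gamma) * b + (1.0 - gamma) ** (n - 1) * M
        terms.append(b)
    return terms


M_CONST, GAMMA = 3.0, 0.25  # illustrative values only
terms = recursion_terms(M_CONST, GAMMA, 2000)

# Largest deviation from the closed form n * M * (1 - gamma)^(n-1).
closed_form_gap = max(
    abs(b - n * M_CONST * (1.0 - GAMMA) ** (n - 1))
    for n, b in enumerate(terms, start=1)
)
# Partial sum of the series; the tail beyond 2000 terms is negligible.
total = sum(terms)
```

The sum equals $M\sum_{n\ge 1}n(1-\gamma)^{n-1}=M/\gamma^2$, which is 48 for these values.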

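We combine this with Lemma 4.6 to finally obtain
$$\begin{aligned}
m\,\mathsf{E}_xT_{m,C_R} &\le \mathsf{E}_x\tau^1+\mathsf{E}_x\mathsf{E}_{X_{\tau^1}}\tau^\eta+m\\
&\le \frac{m\varphi(R')+V(x)}{\varphi(R')-K}+\frac{m\varphi(R')+R'}{\gamma^2(\varphi(R')-K)}+m\\
&\le c_1V(x)+c_2
\end{aligned}$$
for all $x\in E$. This concludes the proof of the statement. □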



Proof of Theorem 2.4. First let us prove that there exist a Lyapunov function $W\colon E\to[0,\infty)$ and positive constants $K_1$, $K_2$ such that
$$P^{t_0}W(x) \le W(x)-\varphi(K_1W(x))+K_2, \qquad x\in E. \qquad (4.14)$$
Choose a sufficiently large $R$ (such that the conditions of Lemma 4.7 hold with $m=t_0$) and let
$$W(x) := \mathsf{E}_x\sum_{k=0}^{\sigma_{t_0,C_R}}\varphi(V(X_{kt_0})).$$
It follows from [16, Theorem 11.3.5(i)] that for $x\in E$
$$P^{t_0}W(x) = W(x)-\varphi(V(x))+I(x\in C_R)\,\mathsf{E}_x\sum_{k=1}^{T_{t_0,C_R}}\varphi(V(X_{kt_0})). \qquad (4.15)$$
Using an argument similar to that in the proof of [U Proposition 4.8(i)], we obtain for any $L>0$ and $x\in E$
$$\mathsf{E}_x\sum_{k=1}^{T_{t_0,C_R}\wedge L}\varphi(1+V(X_{kt_0}))-\mathsf{E}_x\int_0^{T_{t_0,C_R}\wedge L}\varphi(1+V(X_{st_0}))\,ds \le \varphi'(1)Kt_0\,\mathsf{E}_x(T_{t_0,C_R}\wedge L).$$
Furthermore, using condition (2.4) and the concavity of the function $\varphi$, we get for any $x\in E$
$$\begin{aligned}
\mathsf{E}_x\int_0^{T_{t_0,C_R}\wedge L}\varphi(1+V(X_{st_0}))\,ds &= \frac{1}{t_0}\,\mathsf{E}_x\int_0^{t_0(T_{t_0,C_R}\wedge L)}\varphi(1+V(X_u))\,du\\
&\le \varphi(1)\,\mathsf{E}_x(T_{t_0,C_R}\wedge L)+\frac{1}{t_0}\,\mathsf{E}_x\int_0^{t_0(T_{t_0,C_R}\wedge L)}\varphi(V(X_u))\,du\\
&\le V(x)/t_0+(\varphi(1)+K)\,\mathsf{E}_x(T_{t_0,C_R}\wedge L).
\end{aligned}$$


Combining this with the previous inequality and using Lemma 4.7 and Fatou's lemma, we derive for any $x\in E$
$$\begin{aligned}
\mathsf{E}_x\sum_{k=1}^{T_{t_0,C_R}}\varphi(V(X_{kt_0})) &\le V(x)/t_0+\big(\varphi(1)+K+\varphi'(1)Kt_0\big)\,\mathsf{E}_xT_{t_0,C_R}\\
&\le V(x)/t_0+c_3(c_1V(x)+c_2) \le c_4V(x)+c_5, \qquad (4.16)
\end{aligned}$$
where $c_1$ and $c_2$ are defined in Lemma 4.7, $c_3 := \varphi(1)+K+\varphi'(1)Kt_0$, $c_4 := 1/t_0+c_1c_3$, $c_5 := c_2c_3$. Therefore, by the concavity of $\varphi$,
$$W(x) \le \varphi(V(x))+c_4V(x)+c_5 \le V(x)(\varphi'(1)+c_4)+\varphi(1)+c_5.$$
This bound, together with (4.15) and (4.16), yields
$$P^{t_0}W(x) \le W(x)-\varphi(c_6W(x))+c_4R+c_5+c_7$$
for some positive $c_6$, $c_7$. Hence the function $W$ satisfies (4.14).

Now the statement of Theorem 2.4 follows from the corresponding statement for discrete time chains. Indeed, the application of Theorem 2.1 to the skeleton chain $(X_{nt_0})_{n\in\mathbb{Z}_+}$ yields the existence of a measure $\pi$ such that $P^{t_0}\pi=\pi$. Note that for any $0<s<t_0$ the measure $\pi_s := P^s\pi$ is also invariant for this skeleton chain. Indeed, $P^{t_0}\pi_s = P^{t_0+s}\pi = P^sP^{t_0}\pi = \pi_s$. On the other hand, Theorem 2.1 yields uniqueness of the invariant measure. Thus, $P^s\pi=\pi$ and the measure $\pi$ is invariant for the process $X$. Arguing as in the proof of Lemma 4.1 we see that $\pi(\varphi\circ V)\le K$.

It follows from Theorem 2.1 that for any $\varepsilon>0$ there exist constants $C_1$, $C_2$ such that for all $x\in E$, $n\in\mathbb{Z}_+$
$$W_d(P^{nt_0}(x,\cdot),\pi) \le \frac{C_1(1+V(x))}{\varphi\big(H_\varphi^{-1}(C_2n)\big)^{1-\varepsilon}}.$$
We combine this with condition 4 of the theorem to conclude that for any $t>t_0$
$$\begin{aligned}
W_d(P^t(x,\cdot),\pi) &= W_d\big(P^t(x,\cdot),P^{t-\lfloor t/t_0\rfloor t_0}\pi\big)\\
&\le W_d\big(P^{\lfloor t/t_0\rfloor t_0}(x,\cdot),\pi\big)\\
&\le \frac{C_1(1+V(x))}{\varphi\big(H_\varphi^{-1}(C_3t)\big)^{1-\varepsilon}}
\end{aligned}$$
for some $C_3>0$. Here $\lfloor b\rfloor$ denotes the lower integer part of a real $b$. This completes the proof of Theorem 2.4. □